Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correctly cancel GRPC context beneath the HTTP server #3443

Merged
merged 7 commits into from
Mar 12, 2024

Conversation

joe-elliott
Copy link
Member

@joe-elliott joe-elliott commented Feb 29, 2024

What this PR does:
Currently if a GRPC streaming query or HTTP query exceeds the HTTP write timeout when passing through the HTTP server it continues until completion. i.e. context is not correctly cancelled. In addition this unhelpful error message is presented in Grafana:

rpc error: code = Internal desc = stream terminated by RST_STREAM with error code: INTERNAL_ERROR

This PR correctly cancels context for both the gRPC and HTTP frontend endpoints. It updates our dskit dependency to access a new http timeout middleware and uses gRPC interceptor to cancel calls to the streaming querier.

Concerns:

  • The http and gRPC timeout are enforced in very different places due to the nature of the way these options are configured.
  • I don't like the config name api_timeout. Any suggestions?

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Joe Elliott <[email protected]>
@joe-elliott joe-elliott force-pushed the cancel-grpc-context branch from d251d7f to b0346e5 Compare March 7, 2024 21:08
Signed-off-by: Joe Elliott <[email protected]>
Signed-off-by: Joe Elliott <[email protected]>
Signed-off-by: Joe Elliott <[email protected]>
Signed-off-by: Joe Elliott <[email protected]>
Signed-off-by: Joe Elliott <[email protected]>
Signed-off-by: Joe Elliott <[email protected]>
Copy link
Contributor

@zalegrala zalegrala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks reasonable to me. Nice use of the interceptor.

@joe-elliott joe-elliott merged commit eb81b92 into grafana:main Mar 12, 2024
15 checks passed
joe-elliott added a commit to joe-elliott/tempo that referenced this pull request Mar 19, 2024
* cancel context

Signed-off-by: Joe Elliott <[email protected]>

* update dskit

Signed-off-by: Joe Elliott <[email protected]>

* focused timeouts

Signed-off-by: Joe Elliott <[email protected]>

* docs

Signed-off-by: Joe Elliott <[email protected]>

* lint N docs

Signed-off-by: Joe Elliott <[email protected]>

* more lint

Signed-off-by: Joe Elliott <[email protected]>

* make update-mod

Signed-off-by: Joe Elliott <[email protected]>

---------

Signed-off-by: Joe Elliott <[email protected]>
joe-elliott added a commit that referenced this pull request Mar 20, 2024
* log request

Signed-off-by: Joe Elliott <[email protected]>

* move stuff a bit

Signed-off-by: Joe Elliott <[email protected]>

* oh my. e2e tests pass

Signed-off-by: Joe Elliott <[email protected]>

* add handlers

Signed-off-by: Joe Elliott <[email protected]>

* streaming tags

Signed-off-by: Joe Elliott <[email protected]>

* add cli support

Signed-off-by: Joe Elliott <[email protected]>

* improve logging

Signed-off-by: Joe Elliott <[email protected]>

* fix

Signed-off-by: Joe Elliott <[email protected]>

* docs

Signed-off-by: Joe Elliott <[email protected]>

* pipe overrides

Signed-off-by: Joe Elliott <[email protected]>

* cleanup

Signed-off-by: Joe Elliott <[email protected]>

* cleanup

Signed-off-by: Joe Elliott <[email protected]>

* support limits

Signed-off-by: Joe Elliott <[email protected]>

* docs

Signed-off-by: Joe Elliott <[email protected]>

* e2e tests and caching

Signed-off-by: Joe Elliott <[email protected]>

* key prefixes

Signed-off-by: Joe Elliott <[email protected]>

* cache keys

Signed-off-by: Joe Elliott <[email protected]>

* Fixed distinct collection in combiners

Signed-off-by: Joe Elliott <[email protected]>

* fixed combiner bugs and revived tests

Signed-off-by: Joe Elliott <[email protected]>

* restored all tests

Signed-off-by: Joe Elliott <[email protected]>

* lint

Signed-off-by: Joe Elliott <[email protected]>

* made search handler utilities generic

Signed-off-by: Joe Elliott <[email protected]>

* Added handler tests for tags

Signed-off-by: Joe Elliott <[email protected]>

* add diff support

Signed-off-by: Joe Elliott <[email protected]>

* lint

Signed-off-by: Joe Elliott <[email protected]>

* add distinct value collector tests

Signed-off-by: Joe Elliott <[email protected]>

* fix integration tests

Signed-off-by: Joe Elliott <[email protected]>

* diff tests

Signed-off-by: Joe Elliott <[email protected]>

* swapped query for the more robust ExtractMatchers(query)

Signed-off-by: Joe Elliott <[email protected]>

* tests

Signed-off-by: Joe Elliott <[email protected]>

* moved e2e tests to a more sensible place

Signed-off-by: Joe Elliott <[email protected]>

* fix non-deterministic  test

Signed-off-by: Joe Elliott <[email protected]>

* changelog

Signed-off-by: Joe Elliott <[email protected]>

* fix tests for 429 handling

Signed-off-by: Joe Elliott <[email protected]>

* Update docs/sources/tempo/operations/tempo_cli.md

Co-authored-by: Mario <[email protected]>

* review

Signed-off-by: Joe Elliott <[email protected]>

* Update docs/sources/tempo/api_docs/_index.md

Co-authored-by: Kim Nylander <[email protected]>

* Correctly cancel GRPC context beneath the HTTP server (#3443)

* cancel context

Signed-off-by: Joe Elliott <[email protected]>

* update dskit

Signed-off-by: Joe Elliott <[email protected]>

* focused timeouts

Signed-off-by: Joe Elliott <[email protected]>

* docs

Signed-off-by: Joe Elliott <[email protected]>

* lint N docs

Signed-off-by: Joe Elliott <[email protected]>

* more lint

Signed-off-by: Joe Elliott <[email protected]>

* make update-mod

Signed-off-by: Joe Elliott <[email protected]>

---------

Signed-off-by: Joe Elliott <[email protected]>

* Bump anchore/sbom-action from 0.15.8 to 0.15.9 (#3476)

Bumps [anchore/sbom-action](https://github.com/anchore/sbom-action) from 0.15.8 to 0.15.9.
- [Release notes](https://github.com/anchore/sbom-action/releases)
- [Commits](anchore/sbom-action@v0.15.8...v0.15.9)

---
updated-dependencies:
- dependency-name: anchore/sbom-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Doc update (#3482)

* doc: remove reference to previously purged script

* doc: correct label for docs updates

* [TraceQL Metrics] Use new per-tenant max_metrics_duration and fix duration check (#3484)

* Use new per-tenant max_metrics_duration, and fix duration timestamp handling

* Update docs and defaults

* Handle prefixes when listing blocks from S3 and GCS (#3466)

* Handle prefixes when listing blocks from S3

fixes #3465

* Handle prefixes when listing blocks from GCS

* Add test for prefixes when listing blocks from Azure

* Update unit tests to check for actual block IDs instead of just length of the slices

Cleanup unit tests

* Further refine S3/GCS backend for ListBlocks

Brings logic more in line with Azure object parsing.
Also has the benefit of handling prefixes without a trailing slash.

* Update poller integration test to exercise prefixes

* Update e2e test to exercise prefixes

* Fix format check error

* Fix failing e2e tests

* Remove unnecessary prefix permutations from e2e test

* Remove unnecessary test config file copy

* Ignore lint

---------

Co-authored-by: Zach Leslie <[email protected]>

* Update doc-validator.yml (#3483)

Updates the doc-validator to the latest version. Note that this changes the reference format to use the full URL (https://....) instead of /docs/blah

* [DOC] Document Tempo Operator Monolithic mode (#3474)

* [DOC] Document Tempo Operator Monolithic mode

Signed-off-by: Andreas Gerstmayr <[email protected]>

* clarify supported storages

Signed-off-by: Andreas Gerstmayr <[email protected]>

* fix case of title

Signed-off-by: Andreas Gerstmayr <[email protected]>

* Apply suggestions from code review

* Apply suggestions from code review

---------

Signed-off-by: Andreas Gerstmayr <[email protected]>
Co-authored-by: Kim Nylander <[email protected]>

* [DOC] document Grafana data source setup using Grafana and Tempo operators (#3473)

* [docs] document Grafana data source setup using Grafana and Tempo operators

* move the Grafana data source setup page to the operator folder (this
  page is only relevant for the operator)
* document Grafana data source setup using Grafana and Tempo operators

Signed-off-by: Andreas Gerstmayr <[email protected]>

* Apply suggestions from code review

---------

Signed-off-by: Andreas Gerstmayr <[email protected]>
Co-authored-by: Kim Nylander <[email protected]>

* [DOC] fix typo in setup/operator/monolithic.md (#3496)

Signed-off-by: Andreas Gerstmayr <[email protected]>

* Bump github.com/prometheus/client_golang from 1.18.0 to 1.19.0 (#3455)

* Bump github.com/prometheus/client_golang from 1.18.0 to 1.19.0

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.18.0 to 1.19.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](prometheus/client_golang@v1.18.0...v1.19.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Update serverless gomod

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: grafanabot <[email protected]>

* Add support for dashes, quotes and spaces in attribute names (#3458)

* Add support for dashes, quotes and spaces in attribute names

* chlog

* [TraceQL Metrics] Step align query_range time range (#3490)

* Step align query_range time range

* Time range error: improve message and fix format for prom format.

* oops remove printlns

* lint

* changelog

* 2.4.1 changelog (#3503)

Signed-off-by: Joe Elliott <[email protected]>

* [DOC] Add 2.4.1 release notes (#3504)

* fix tests due to interface changing

Signed-off-by: Joe Elliott <[email protected]>

* Pass context

Signed-off-by: Joe Elliott <[email protected]>

---------

Signed-off-by: Joe Elliott <[email protected]>
Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: Andreas Gerstmayr <[email protected]>
Co-authored-by: Mario <[email protected]>
Co-authored-by: Kim Nylander <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Matt Simonsen <[email protected]>
Co-authored-by: Martin Disibio <[email protected]>
Co-authored-by: Ben Foster <[email protected]>
Co-authored-by: Zach Leslie <[email protected]>
Co-authored-by: Andreas Gerstmayr <[email protected]>
Co-authored-by: grafanabot <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants